Title: Double-stranded DNA with cohesive end(s), and method of
shuffling DNA using the same

INDUSTRIAL FIELD

The present invention relates to a double-stranded DNA with a
cohesive end or cohesive ends having a desired sequence and a
method for producing it, and also to a method for shuffling a DNA
using DNA blocks with a cohesive end or cohesive ends, the DNA as
shuffled according to the method, a DNA pool obtained according to
the shuffling method, and a genetic product produced by the use of
the DNA pool.

BACKGROUND ART

One approach to protein engineering for improving naturally-existing
proteins into modified ones that are more useful to human beings is
to improve proteins through site-specific mutation, which has
produced some results (Japanese Patent Application Laid-open No.
5-91876). However, this requires the clarification or identification
of the stereostructure of the targeted protein, and much labor is
needed for the analysis of the stereostructure. In addition, even
when the stereostructure has been clarified or identified, many
matters remain unknown regarding the relationship between the
structure and the function of proteins. Therefore, it is still
difficult to reliably impart an intended function to the targeted
protein.

In order to overcome these difficulties, a process comprising random
mutation and screening, and also evolutional molecular engineering
that utilizes the evolution of organisms, have been highlighted and
are said to be extremely useful (Proc. Natl. Acad. Sci., USA, 83,
576 (1986)). However, the current methods are directed to the
substitution of at most several amino acids.

WO95/22625 discloses a method for forming novel genes by dividing a
plurality of genes at random and homologously recombining them to
reconstruct novel genes. However, this is one method for forming
chimera genes. The genes formed by this method are similar to the
original genes, and the former retain the essential base sequences
of the latter.

Using such known methods, it is difficult to impart to organisms
additional functions which they could not gain during the steps of
their evolution. In order to obtain genetic products, such as
proteins, of which the functions are greatly different from those of
naturally-existing substances, it is believed effective to prepare a
pool of nucleic acids having base sequence spaces significantly
different from those existing naturally, and to produce from them
genetic products having the intended functions.

One method for this may be to prepare a nucleic acid pool that
covers all base combinations. However, even the total number of the
base sequences that may code for a relatively small protein with 100
amino acids (300 bp) is an enormous



Printed from Mimosa 12/15/1999 



WO 98/05765 



3 



PCT/DK97/00317



number of 4^300 (about 10^180), and it is in fact impossible to
prepare a nucleic acid pool that covers all of them.

For proteins of some kinds, their sub-structures, which are referred
to as modules, were specifically noted, and an attempt was made to
change the order of the base sequence blocks corresponding to the
individual modules to thereby produce mutants having different
module sequences (Viva Origino, Vol. 22, No. 1 (1995) 86-87). In
this attempt, however, the base sequences were merely re-sequenced
individually for the individual mutants. No one has heretofore
attempted the formation of a nucleic acid pool covering all
re-sequenced molecules and the collection, from the pool, of genes
capable of expressing products having intended properties.

Utilizing restriction enzymes, it is possible to prepare a nucleic
acid pool including various molecules by blending several kinds of
DNA blocks having the same cohesive end or blunt end followed by
ligating them at random, and to select from this pool some molecules
having desired properties. According to this method, however, the
DNAs to be used must have predetermined restriction enzyme
recognizing sites. Even when the DNAs have such restriction enzyme
recognizing sites, there is an extremely small probability that the
sites are positioned at the desired positions. In addition, in this
method both ends of the blocks must be of the same type, and there
is therefore a high probability that the blocks are self-ligated.
Restriction enzyme recognizing sites may be formed through
site-specific mutation in order to overcome these problems. However,
whether or not the blocks can be ligated in accordance with the
predetermined frame is then largely governed by chance. That is,
whether or not a desired protein can be produced without misreading
the reading frame of the codons is governed by chance. Therefore,
the method is extremely inefficient.

The object of the present invention is to provide a method for
efficiently obtaining base sequences that exist in sequence spaces
greatly different from those of naturally-existing base sequences,
and also to provide genetic products obtained by expressing, as
genes, the nucleic acid sequences that are obtained in that manner
and that do not exist naturally.

The sequence space of a gene includes every full-length sequence
that can theoretically be constituted by a combination of the four
bases, A, G, C and T. For example, a base sequence that codes for a
protein composed of a number "n" of amino acids is constructed by
selecting and sequencing any desired one of the four bases a total
of 3n times, therefore including 4^(3n) combinations. Accordingly,
a protein composed of 100 amino acids includes different base
sequences of about 10^180 types, as mentioned hereinabove.
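
The arithmetic behind this estimate can be checked with a short
sketch. The function names below are illustrative only and are not
part of the specification; the sketch simply counts sequences of
length 3n over four bases.

```python
import math


def sequence_space_size(n_amino_acids: int) -> int:
    """Number of distinct base sequences of length 3n over A, G, C and T."""
    return 4 ** (3 * n_amino_acids)


def order_of_magnitude(n_amino_acids: int) -> float:
    """log10 of the sequence space size, computed without the big integer."""
    return 3 * n_amino_acids * math.log10(4)


if __name__ == "__main__":
    n = 100  # a relatively small protein of 100 amino acids (300 bases)
    print("decimal digits of 4^300:", len(str(sequence_space_size(n))))
    print("log10(4^300):", round(order_of_magnitude(n), 1))
```

For n = 100 the exponent is 300 x log10(4), or roughly 180.6, which
agrees with the figure of about 10^180 given in the text.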

In fact, there is no limitation on the number of amino acids that
constitute proteins. Therefore, the sequence spaces for proteins
extend without limit. During the steps of evolution of organisms,
only a part of such sequence spaces has been examined, and there is
a great probability that sequences coding for proteins with
extremely excellent functions exist in the remaining, vast sequence
spaces. The protein engineering studies which have been and are
being made in many laboratories and institutes at present are
essentially directed to the creation of novel proteins having
functions superior to those of naturally-existing proteins, and one
essential approach made therein for this purpose is to substitute
amino acids in existing sequences, as mentioned hereinabove.
However, amino acid substitution is nothing but the essential means
that organisms have used during the steps of their evolution; that
is, it imitates organisms and searches only around the sequences
that organisms have already examined. In addition, there is a
probability that the sequences thus obtained will be those that were
already weeded out in the past.

We, the present inventors, have considered that, in order to move
far away from the sequence spaces that organisms have already
examined, the purpose will be attained if we carry out operations
that could not have been carried out by organisms. We know that the
division of a gene into several blocks followed by a change in the
order of the thus-divided blocks, if it occurred in organisms, would
kill the organisms. Therefore, we have concluded that this method is
suitable for our purpose. Having thus concluded, we, the present
inventors, have assiduously studied various matters relating to this
method and, as a result, have found a method of forming a desired
cohesive end or ends on a desired DNA. Utilizing this method, we
have succeeded in establishing a method of dividing a gene into
several blocks and re-sequencing them into different sequences, and
also a method of producing a molecule pool including such different
base sequences existing in different sequence spaces, and thus have
completed the present invention.

Accordingly, the present invention provides the following: 

1) A DNA with a cohesive end comprising (a) a double-stranded DNA
having the same sequence as that of a part of a gene, and (b) a
single-stranded DNA having a base sequence that exists on said gene
at the site not adjoining the part corresponding to said
double-stranded DNA or a base sequence which said gene does not
have, wherein the single-stranded DNA is linked to either one end of
the double-stranded DNA to form a cohesive end.

2) A DNA with cohesive ends comprising (a) a double-stranded DNA
having the same sequence as that of a part of a gene, (b) a first,
single-stranded DNA having a base sequence that exists on said gene
at the site not adjoining the part corresponding to said
double-stranded DNA or a base sequence which said gene does not
have, and (c) a second, single-stranded DNA having a base sequence
that exists on said gene at the site adjoining the part
corresponding to said double-stranded DNA, wherein the second,
single-stranded DNA is linked to said double-stranded DNA at one end
corresponding to said adjoining site, while the first,
single-stranded DNA is linked thereto at the other end of the
complementary strand opposite to said end, thereby forming cohesive
ends.

3) The DNA with a cohesive end or cohesive ends according to the
previous 1) or 2), wherein the single-stranded DNA has a length of 2
bases or more.

4) The DNA with a cohesive end or cohesive ends according to any one
of the previous 1) to 3), wherein the cohesive end/ends is/are
positioned at the 3'-terminal/terminals.

5) A method for producing a DNA with a cohesive end or cohesive
ends, wherein a part of a DNA, as a template, and an oligonucleotide
containing at least one ribonucleotide, as a primer, are subjected
to DNA polymerase reaction to prepare a double-stranded DNA, then
the ribonucleotide(s) is/are removed through enzymatic reaction or
chemical reaction, and the nucleotide(s) remaining at the
5'-terminal(s) of the site(s) at which said ribonucleotide(s)
existed are removed.

6) A method for producing the DNA with a cohesive end of the
previous 1), comprising the following steps a) to d):

a) a step of linking (i) an oligonucleotide having the same base
sequence as that of a part of a gene DNA to (ii) an oligonucleotide
having a base sequence that exists on the gene at the site not
adjoining the base sequence of (i) or a base sequence which the gene
does not have, and containing at least one ribonucleotide, in such a
manner that the oligonucleotide (ii) is positioned at the
5'-terminal of the oligonucleotide (i);

b) a step of preparing a double-stranded DNA through DNA polymerase
reaction between a DNA containing the part corresponding to the
oligonucleotide (i) in said a), as a template, and the linked
oligonucleotide as obtained in the previous step a), as a primer;

c) a step of removing the ribonucleotide from said double-stranded
DNA through enzymatic reaction or chemical reaction; and

d) a step of removing the nucleotide remaining at the 5'-terminal of
the site at which said ribonucleotide existed.

7) A method for producing the DNA with cohesive ends of the
previous 2), comprising the following steps a) to d):

a) a step of linking (i) an oligonucleotide having the same base
sequence as that of a part of a gene DNA to (ii) an oligonucleotide
having a base sequence that exists on the gene at the site not
adjoining the base sequence of (i) or a base sequence which the gene
does not have, and containing at least one ribonucleotide, in such a
manner that the oligonucleotide (ii) is positioned at the
5'-terminal of the oligonucleotide (i);

b) a step of preparing a double-stranded DNA through DNA polymerase
reaction between a DNA containing the part corresponding to the
oligonucleotide (i) in said a), as a template, and (i) the linked
oligonucleotide as obtained in the previous step a) and (ii) an
oligonucleotide which is a complementary strand of an
oligonucleotide existing on the gene at the site separated from said
oligonucleotide-corresponding part by 3 bases or more toward the
3'-terminal and which contains at least one ribonucleotide, as
primers;

c) a step of removing the ribonucleotides from said double-stranded
DNA through enzymatic reaction or chemical reaction; and

d) a step of removing the nucleotides remaining at the 5'-terminals
of the sites at which said ribonucleotides existed.

8) A method for shuffling a DNA, comprising dividing a DNA into a
plurality of DNA blocks each having a cohesive end or cohesive ends,
followed by ligating them together into a sequence that is different
from the sequence of the original, non-divided DNA.

9) A method for shuffling a DNA, comprising applying the method of
any one of the previous 5) to 7) to various sites of a DNA, thereby
dividing the DNA into a plurality of DNA blocks each having a
cohesive end or cohesive ends, at least one block of which shall
have a cohesive end that is complementary to the cohesive end of
another block not having been directly adjacent to said one block on
the original DNA, followed by ligating them together into a sequence
that is different from the sequence of the original, non-divided
DNA.

10) The shuffling method according to the previous 8) or 
9), wherein the DNA is divided into 3 or more blocks. 

11) The shuffling method according to any one of the previous 8) to
10), wherein the blocks are ligated together using a DNA ligase.

12) A DNA as shuffled according to the method of any one of the
previous 8) to 11).

13) The DNA according to the previous 12), wherein a gene coding
for an enzymatic function or a control gene for the gene is
shuffled.

14) The DNA according to the previous 13), wherein the gene is a
gene that codes for any one of proteases, lipases, cellulases,
amylases, catalases, xylanases, oxidases, dehydrogenases, oxygenases
and reductases.

15) The DNA according to the previous 13) or 14), wherein the gene
is one derived from prokaryotes.

16) The DNA according to the previous 15), wherein the gene is one
derived from Bacillus bacteria.

17) The DNA according to the previous 16), wherein the gene is a
protease API21 gene.

18) A DNA pool containing plural kinds of DNAs having different
structures that are obtained according to the shuffling method of
any one of the previous 8) to 11).

19) The DNA pool according to the previous 18), which contains 10 or
more kinds of DNAs.

20) A method for producing a DNA pool, comprising applying the
method of any one of the previous 5) to 7) to various sites of a
template DNA to thereby prepare a mixture of DNA blocks each having
a cohesive end or cohesive ends that satisfies the following
conditions, followed by ligating these into any desired sequences:

Condition 1: Each block has a double-stranded site having the same
sequence as that of a part of the template DNA.

Condition 2: At least two of the blocks that constitute the block
mixture further have, in addition to said double-stranded site, a
single-stranded site (cohesive end) that is complementary to the
cohesive end of blocks that are not directly adjacent to said blocks
on the template DNA.

Condition 3: The block mixture contains at least two different
blocks which are the same in the double-stranded site but are
different only in the single-stranded site and which satisfy the
condition 2.
21) The method for producing a DNA pool according to the previous
20), wherein the template DNA is a gene that codes for an enzymatic
function or a control gene DNA for the gene.

22) The method for producing a DNA pool according to the previous
21), wherein the template DNA is a gene DNA that codes for any one
of proteases, lipases, cellulases, amylases, catalases, xylanases,
oxidases, dehydrogenases, oxygenases and reductases.

23) The method for producing a DNA pool according to the previous
22), wherein the template DNA is one derived from prokaryotes.

24) The method for producing a DNA pool according to the previous
23), wherein the template DNA is one derived from Bacillus bacteria.

25) The method for producing a DNA pool according to the previous
24), wherein the template DNA is a protease API21 gene.

26) The method for producing a DNA pool according to any 
one of the previous 20) to 25), wherein the DNA blocks are 
ligated together using a DNA ligase. 

27) A genetic product obtained by expressing the genetic
information on DNA molecules that exist in the DNA pool of any one
of the previous 18) to 26).

Now the present invention is described in detail hereinunder.



DNA with Cohesive End(s)

The present invention provides a DNA with any desired cohesive end
or ends (herein referred to as "DNA with cohesive end(s)" unless
otherwise specifically indicated). The cohesive end as referred to
herein indicates a single-stranded site protruding from the end of a
double-stranded DNA. Such a cohesive end may be formed when a DNA is
cleaved with a restriction enzyme such as EcoRI. In this case,
however, the base sequence of the thus-formed cohesive end is
determined by the restriction enzyme used, and its length is
generally several bases or so. If a naturally-existing DNA is
cleaved with a restriction enzyme, the sequence of the resulting
double-stranded part of the DNA is also limited to the region
sandwiched between the restriction enzyme recognizing sites. As
opposed to this, the DNA with cohesive end(s) of the present
invention may have a structure in which a cohesive end or cohesive
ends having a desired length and a desired sequence is/are added to
the end or ends of a double-stranded DNA having a desired sequence.

As has been mentioned hereinabove, the sequence of the
double-stranded part of the DNA with cohesive end(s) of the present
invention is not specifically defined. For example, the sequence may
be the same as that of a part of a gene. Though also not
specifically defined, its length may generally be 30 base pairs (bp)
or more, preferably 45 bp or more. The sequence of the cohesive end
is also not specifically defined, but in order to prevent its
self-ligation in various reactions, it is preferable that the
sequence does not form a stem structure. The "sequence to form a
stem structure" as referred to herein includes, for example, AATT,
which has exactly the same sequence as its complementary strand
(TTAA when written in the 3' to 5' direction). The length of the
cohesive end may generally be 2 bases or more, preferably from 15 to
30 bases. If the cohesive end is too long, it may form a secondary
structure, which makes intermolecular annealing difficult. However,
if it is too short, its melting temperature (Tm) is low and the
annealing will be unstable.
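
The two design criteria above (no self-complementary sequence, and a
melting temperature that is not too low) can be screened with a
minimal sketch such as the following. The function names are
illustrative, and the Tm estimate uses the common rule of thumb of
2 °C per A/T and 4 °C per G/C, which is an assumption brought in for
illustration and not a formula given in this specification; it is
only a rough guide for short oligonucleotides.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complementary strand, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_self_complementary(seq: str) -> bool:
    """True if the sequence equals its own reverse complement (e.g. AATT)."""
    return seq.upper() == reverse_complement(seq)


def rough_tm(seq: str) -> int:
    """Rule-of-thumb melting temperature in deg C: 2*(A+T) + 4*(G+C)."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for candidate in ("AATT", "GATTACAGGTCCATGC"):
        print(candidate, is_self_complementary(candidate), rough_tm(candidate))
```

A candidate cohesive end such as AATT is flagged as self-complementary
and would therefore be avoided, since two copies of the same block
could anneal to each other.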

The cohesive end may be linked to either the 3'-terminal or the
5'-terminal of the double-stranded DNA, but is preferably linked to
the 3'-terminal thereof. The cohesive end may be linked to only one
terminal of the double-stranded DNA or to both terminals thereof.

Method for Producing DNA with Cohesive End(s)
The DNA with cohesive end(s) of the present invention can be
produced typically according to a method comprising the following
steps a) to d). The method mentioned below is directed to the
production of a DNA with a cohesive end, which has a structure
represented by a formula (2):

(2)

wherein X and c are desired sequences; a and * are sequences that
are complementary to X and c, respectively; and as is a sequence of
a cohesive end,
and which is based on a double-stranded DNA (template DNA) having a
structure represented by a formula (1):

(1)

wherein X, c, a, S, and ^ have the same meanings as above.

However, the present invention is not limited to only the production
illustrated herein; other DNAs with cohesive end(s) having other
structures can also be produced in the same manner as below
according to the present invention.

Step a)

(a-1) Preparation of Oligonucleotide:

First, the part that is to be selected as the double-stranded part
of the intended DNA with a cohesive end is defined on a template
DNA. An oligonucleotide, a, which is complementary to its terminal,
X, and an oligonucleotide, c, having the same sequence as that of
the other terminal, c, are prepared. X and c each may have a
sequence having a base length of from 15 to 30 bp or so.

On the other hand, an oligonucleotide, b, is prepared which is
complementary to the sequence obtained by removing one base (this is
referred to as X) from the 5'-terminal of the sequence of the
intended cohesive end, es. The base sequence, &r r may be a part of
the above-mentioned DNA or may be any other sequence that the DNA
does not have.

These oligonucleotides, a, b and c, may be prepared by any methods.
If their sequences are previously known, they can be synthesized
using a known DNA synthesizer.

(a-2) Preparation of Ribonucleotide-Containing Fragments:

Next, the oligonucleotides, a and b, are linked together via a
ribonucleotide. This linkage can be attained by ordinary
synthesizing methods. For example, it can be attained according to
the process mentioned below.

First, a phosphoryl group is added to the 5'-terminal of the
oligonucleotide, a, according to the reaction of the following
formula (3):

(3)
wherein (P) is a phosphoryl group.
This reaction can be effected by the action of a polynucleotide
kinase. ATP is used in an amount of from 2 to 10 times or so, by
mol, relative to the oligonucleotide, a. The reaction temperature
may be from 30 to 40°C or so. The reaction time may be from 10
minutes to 1 hour or so. Most suitably, the pH is from 7 to 9 or so.
After the addition of the phosphoryl group thereto, the
oligonucleotide is represented by a'.
On the other hand, a ribonucleotide is added to the 3'-terminal of
the oligonucleotide, b, according to the reaction of the following
formula (4):

(4)

wherein XTP is any one of ATP, GTP, CTP and UTP; (rX) is a
ribonucleotide.

This reaction can be effected by the action of, for example, a
terminal deoxynucleotidyl transferase. For the nucleoside
triphosphate (XTP) to be used herein, a ribonucleotide that
corresponds to the base X in the previous step (a-1) is selected.
The nucleoside triphosphate is used in an amount of from 2 to 10
times, by mol, relative to the oligonucleotide, b. The reaction
temperature may be from 30 to 40°C or so. The reaction time may be
from 30 minutes to 2 hours or so. After the addition thereto, the
oligonucleotide is represented by b'.
The sequence of b' is complementary to the sequence, 

The thus-obtained oligonucleotides, a' and b', are mixed, whereby
the 5'-terminal (phosphoryl group) of a' is bonded to the
3'-terminal (hydroxyl group) of the ribonucleotide of b', according
to the reaction of the following formula (5):

(5).

This reaction can be effected by the action of an RNA ligase in the
presence of ATP and divalent metal ions (Japanese Patent Application
Laid-open No. 5-292967). Divalent metal ions useful in this reaction
include, for example, magnesium ions and manganese ions, but
preferred are magnesium ions. The ligase employed is an RNA ligase,
an enzyme that catalyzes the ligation of the hydroxyl group at the
3'-terminal and the phosphoryl group at the 5'-terminal; it also
efficiently catalyzes the ligation of a polydeoxyribonucleotide
having a ribonucleotide only at its 3'-terminal and a
polydeoxyribonucleotide with a 5'-terminal phosphoryl group.
Preferably used is a T4 RNA ligase. The reaction is generally
effected in a buffer, at a pH of from 7 to 9 and at a temperature of
from 10 to 40°C, over a period of from 30 to 180 minutes. For
example, the oligonucleotides may be reacted in a solution
comprising 50 mM Tris-HCl (pH 8.0), 20 mM MgCl2, 0.1 mM ATP, 10
mg/liter BSA, 1 mM hexaammine cobalt chloride (HCC) and 25 %
polyethylene glycol 6000, at 25°C for 60 minutes or longer.
Step b) 

Using the DNA containing the sequence, X, as prepared in the
previous step (a-1), as a template, and using the linked
oligonucleotide, b'-a', as prepared in the previous step (a-2), as a
primer, a double-stranded DNA is prepared through DNA polymerase
reaction. In general, a double-stranded DNA containing the sequence,
X, and a sequence, S, on their strands is subjected to thermal or
alkaline denaturation to give single-stranded DNAs, which are added
to the primer of b'-a' and subjected to PCR with the
oligonucleotide, c, as prepared in the previous step (a-1). The
primer annealing condition and the polymerase reaction condition to
be employed herein may be the same as those in ordinary polymerase
reaction. The DNA polymerase to be employed herein may be any enzyme
that can catalyze the DNA chain-extending reaction, which includes,
for example, Taq polymerase, Klenow fragment, DNA polymerase I, etc.
As a result of this reaction, obtained is a double-stranded DNA with
blunt ends, which is represented by a formula (6):

(6).

Step c) 

Next, the ribonucleotide is removed from the double-stranded DNA
through enzymatic reaction or chemical reaction. One example of
useful enzymes is a ribonuclease. The reaction is generally effected
at a pH of from 6 to 8 or so, at from 30 to 70°C or so, over a
period of from 10 to 60 minutes or so. Non-enzymatic chemicals
usable herein include sodium hydroxide and the like. As a result of
this reaction, obtained is a partly-discontinuous, double-stranded
DNA of the following formula (7), in which the part corresponding to
the above-mentioned base, X, has been deleted.

(7). 

Step d)

After the above step, the nucleotide existing at the 5'-terminal of
the above-mentioned deletion is removed. To remove this nucleotide,
for example, the double-stranded DNA, from which the ribonucleotide
has been removed in the previous step c), is heated at from 50 to
90°C or so. The polynucleotide that has been separated from the
strand through this reaction can be removed using, for example, a
spin column or the like. Thus is obtained the double-stranded DNA
with a cohesive end of the above-mentioned formula (2).

In the process mentioned above, obtained is a double-stranded DNA
with a cohesive end only at one of its 3'-terminals.



In the same manner as this, also obtainable is a double-stranded
DNA with cohesive ends at both 3'-terminals.
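
Steps a) to d) can be followed at the level of plain sequence strings
with the sketch below. It is an illustration under stated
assumptions, not the experimental protocol: strands are written 5'
to 3', the single ribonucleotide in the primer is marked by a
lowercase letter, and all function names, the example sequences and
the default annealing length of 15 bases are hypothetical.

```python
from typing import NamedTuple, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complementary strand, written 5' to 3' (ribonucleotide case is dropped)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


class Duplex(NamedTuple):
    top: str     # 5' -> 3'
    bottom: str  # 5' -> 3'; a lowercase letter marks the ribonucleotide


def build_chimeric_primer(region_top: str, overhang: str, anneal_len: int = 15) -> str:
    """Step a): 5'-[DNA tail][one ribonucleotide][annealing part]-3'."""
    tail = reverse_complement(overhang)  # complementary to the intended cohesive end
    anneal = reverse_complement(region_top)[:anneal_len]
    return tail[:-1] + tail[-1].lower() + anneal


def amplify(primer: str, region_top: str, anneal_len: int = 15) -> Duplex:
    """Step b): polymerase extension gives a blunt-ended double strand."""
    bottom = primer + reverse_complement(region_top)[anneal_len:]
    return Duplex(top=reverse_complement(bottom), bottom=bottom)


def remove_ribonucleotide(duplex: Duplex) -> Tuple[str, str]:
    """Step c): cut the bottom strand at the ribonucleotide and drop it."""
    idx = next(i for i, ch in enumerate(duplex.bottom) if ch.islower())
    return duplex.bottom[:idx], duplex.bottom[idx + 1:]


def make_cohesive_end(region_top: str, overhang: str, anneal_len: int = 15) -> Duplex:
    k = min(anneal_len, len(region_top))
    primer = build_chimeric_primer(region_top, overhang, k)
    blunt = amplify(primer, region_top, k)
    _released_fragment, bottom = remove_ribonucleotide(blunt)
    # Step d): heating releases the short 5' fragment, leaving a 3' overhang on top.
    return Duplex(top=blunt.top, bottom=bottom)


if __name__ == "__main__":
    product = make_cohesive_end("ATGGCTAGCTAGGATCCGATCGTACGATCG", "GATTACAGGTC")
    print(product.top)
    print(product.bottom)
```

In this sketch the top strand ends up longer than the bottom strand by
exactly the chosen overhang, which corresponds to a cohesive end at
one 3'-terminal.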

In the above-mentioned process, a desired sequence, £*r, which does
not adjoin the sequence, X, in the template DNA was introduced into
the DNA to form the cohesive end. Apart from this, it is also
possible to introduce thereinto an additional oligonucleotide that
adjoins the sequence in the template DNA to form another cohesive
end. For example, in the embodiment mentioned above, an
oligonucleotide, c', which is different from the oligonucleotide, c,
in that its 3'-terminal deoxyribonucleotide is substituted with a
ribonucleotide, may be used as the primer in place of the
oligonucleotide, c, to prepare a double-stranded DNA with two
cohesive ends of a formula (8):

(8).
Method of Shuffling DNA 

The present invention also provides a method of shuffling a DNA,
which is characterized by using DNAs with cohesive end(s). The
term "shuffling" as referred to herein indicates the operation of
dividing a DNA into plural blocks followed by re-sequencing them
into a desired, different sequence.

For example, where one DNA has a sequence composed of a number, n,
of blocks, as represented by a formula (9):
A - a1 - a2 - ... - an - B   (9)

wherein the starting end A and/or the terminal end B may be omitted,
this may be shuffled according to the present invention to give a
different DNA represented by a formula (10):
A - a1' - a2' - ... - ax' - B   (10)

wherein a1', a2', ..., ax' are blocks that are independently
selected from the group of a1, a2, ..., an; and the total number of
the blocks a1', a2', ..., ax' may not be the same as the total
number of the blocks a1, a2, ..., an.
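
As a small illustration of formula (10), the following sketch
enumerates the arrangements that result when x blocks are each chosen
independently from a1 to an and placed between the fixed ends A and
B. The function names are illustrative only; since each position is
chosen independently, there are n^x such arrangements.

```python
from itertools import product
from typing import List, Sequence


def shuffled_arrangements(blocks: Sequence[str], length: int) -> List[List[str]]:
    """All orders of `length` blocks, each drawn independently from `blocks`."""
    return [list(choice) for choice in product(blocks, repeat=length)]


def with_fixed_ends(arrangement: Sequence[str], start: str = "A", end: str = "B") -> List[str]:
    """Place the unchanged starting end and terminal end around the arrangement."""
    return [start, *arrangement, end]


if __name__ == "__main__":
    pool = shuffled_arrangements(["a1", "a2", "a3"], 3)
    print(len(pool), "arrangements")
    print(" - ".join(with_fixed_ends(["a3", "a1", "a2"])))
```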

The principle of the DNA shuffling of the present invention, which
utilizes DNAs with cohesive end(s), is graphically illustrated in
Fig. 1. In Fig. 1, the DNA is shuffled at the intermediate part,
p1 - p2 - p3 (the uppermost row), into p3 - p1 - p2 (the lowermost
row), without changing both ends, pA and pB. This shuffling
operation is useful as a method for obtaining gene sequences that
have not heretofore existed naturally, without changing the
sequences of the promoter and the terminator.

Concretely, the above-mentioned method of preparing DNAs with
cohesive end(s) is applied first to the parts pA, p1, p2, p3 and pB
constituting the template DNA, to thereby prepare DNA blocks, a1, a2
and a3, each having the structure with two cohesive ends (formula
(8)), and DNA blocks, A and B, each having the structure with one
cohesive end (formula (2)). The cohesive ends, aA, a1f, a2f and a3f,
are formed by removing the corresponding complementary strand from
the blocks, pA, p1, p2 and p3, respectively.

The cohesive ends, a1r, a2r, a3r and aB, are designed according to
the desired sequence to be prepared after the shuffling. In the
embodiment of Fig. 1, the end, a1r, is designed to be a
complementary strand to the end, a3f, and the block, a1, is ligated
to the block, a3, after the shuffling. The ligation is conducted
using a DNA ligase in the presence of ATP. The type of the DNA
ligase to be employed herein is not specifically defined. In this
embodiment, since the single-stranded part of each cohesive end is
long, it is unnecessary to employ the ordinary reaction at 16°C, and
a thermophilic DNA ligase is advantageously employed.

In the embodiment of Fig. 1, a2r, a3r and aB are designed
to be the complementary strands to a1f, aA and a2f,
respectively, in the same manner as above. As a result of the
shuffling, a sequence having a structure of A - a3 - a1 - a2 - B
is finally obtained. This is seemingly the same as the
re-sequenced order of pA - p3 - p1 - p2 - pB that would be
obtained by dividing the original DNA into the constitutive
blocks p1, p2, p3, pA and pB, followed by re-sequencing these
into a different sequence.
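The way the designed complementarity of the cohesive ends fixes the block order can be sketched in Python. This is an illustration only: the `blocks` and `pairs` tables and the `assemble` function are hypothetical names, the end labels follow the embodiment of Fig. 1, and real ligation acts on a population of molecules rather than on a deterministic lookup.

```python
# Each block is listed with its left ("r") and right ("f") cohesive end.
# None marks a terminal that carries no cohesive end in this embodiment.
blocks = {
    "A":  (None, "aA"),
    "a1": ("a1r", "a1f"),
    "a2": ("a2r", "a2f"),
    "a3": ("a3r", "a3f"),
    "B":  ("aB", None),
}

# Designed complementarity: right-hand end -> left-hand end it anneals to.
pairs = {"aA": "a3r", "a3f": "a1r", "a1f": "a2r", "a2f": "aB"}


def assemble(blocks, pairs, start="A"):
    """Follow complementary ends from the start block to a free terminal."""
    by_left_end = {left: name for name, (left, _) in blocks.items() if left}
    order = [start]
    right = blocks[start][1]
    while right is not None:
        nxt = by_left_end[pairs[right]]
        order.append(nxt)
        right = blocks[nxt][1]
    return order


shuffled = assemble(blocks, pairs)
print(" - ".join(shuffled))  # A - a3 - a1 - a2 - B
```

Changing only the `pairs` table gives a different order, which mirrors how redesigning the cohesive ends redirects the shuffling.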

Any other desired sequences can be realized in the same
manner as above. If the block A or B is made to have two
cohesive ends, while the other blocks are made to have one
cohesive end, it is possible to obtain still different sequences
through shuffling where the latter blocks with one cohesive end
are positioned at the terminals.

In the shuffling of the invention, it is also possible to
introduce foreign DNA block(s) with cohesive end(s), which are
not in the original gene, into the gene DNA. For example, it
is possible to shuffle two or more gene DNAs. In this case,
the terminal of one gene, for example, the block A in the
above-mentioned embodiment, may be processed into a DNA block
with two cohesive ends, if desired.

The blocks, which are the units to be shuffled, are
oligonucleotides or polynucleotides composed of 2 or more
nucleotides (hereinafter referred to as "oligonucleotides").
In general, these are preferably composed of 30 or more
nucleotide units, more preferably 45 or more nucleotide units.
The upper limit of the block length is not specifically
defined, provided that the block length is shorter than the
length of one gene. If, however, the block length is too
large, the re-sequenced DNA obtained by the shuffling
will have many non-mutated base sequence parts. Therefore, in
general, the block length is preferably within the range of
from 10 to 35 % of the length of a gene.

Where the gene to be shuffled is a gene that codes for a
protein, it is desirable that the gene blocks (oligonucleotides)
have the same reading frame before and after the division.
Namely, the gene blocks to be shuffled are desirably so
designed that they are always translated to give the
corresponding amino acid sequences, irrespective of their
relative positions in the shuffled sequence. For this, the
double-stranded parts and the cohesive ends are selected in
codon units in accordance with the reading frame of the gene DNA
to be shuffled.

Needless to say, it is unnecessary to conduct the division
into segment blocks with genetic meanings. Namely, it is
unnecessary to divide the gene DNA into the constitutive exons
or into segment blocks that correspond to the domains or modules
of the protein which the gene DNA codes for. It is probable
that shuffling at such sites has already been examined in the
natural world in the past. In order to obtain base sequences
that have not heretofore been examined in the natural world, it
is desirable that the division of the gene DNA is effected
inside the constitutive exons or at the sites corresponding to
the inside of the domains or modules of the protein which the
gene DNA codes for.
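The reading-frame condition described above for protein-coding genes can be illustrated with a small Python sketch. The sequence, the cut positions and the function names are invented for the example and are not taken from the gene of the Examples; the point is only that cuts placed on codon boundaries leave every block translatable in the same frame wherever it ends up.

```python
def split_blocks(seq, cut_positions):
    """Cut seq at the given 0-based positions and return the pieces in order."""
    bounds = [0, *sorted(cut_positions), len(seq)]
    return [seq[i:j] for i, j in zip(bounds, bounds[1:])]


def keeps_reading_frame(cut_positions, frame_start=0):
    """True if every cut lies on a codon boundary of the given frame."""
    return all((pos - frame_start) % 3 == 0 for pos in cut_positions)


def codons(seq):
    """Split a sequence into consecutive triplets, starting at position 0."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


toy_cds = "ATGGCTGAATTTAAA"  # invented 15-nt coding sequence
in_frame = split_blocks(toy_cds, [6, 12])       # cuts on codon boundaries
out_of_frame = split_blocks(toy_cds, [6, 10])   # second cut splits a codon

# Reordering in-frame blocks rearranges whole codons but never breaks one.
reordered = in_frame[1] + in_frame[0] + in_frame[2]
print(codons(toy_cds), codons(reordered))
```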

Employing such means, therefore, it is possible to obtain
proteins which have different structures as a whole from those
of natural proteins but which partly contain amino acid
sequences that have been confirmed to be useful in the natural
world. Accordingly, the probability of obtaining useful proteins
by such means is increased, as compared with the means of
synthesizing proteins totally at random.
The kind of the gene to be shuffled according to the
present invention is not specifically defined. Employable herein
is any and every gene that is composed of polynucleotide chains
and contains a coding region necessary for expressing a protein
or RNA. The nucleotide unit may contain any molecule of
deoxyribonucleotides or ribonucleotides. For the purpose of
finding out useful base sequences, preferred are genes coding
for proteins, especially enzymes, or control genes for enzymatic
functions. Examples of such enzymes include proteases,
lipases, cellulases, amylases, catalases, xylanases, oxidases,
dehydrogenases, oxygenases and reductases.

The kind of the gene to which the present invention is
directed is not specifically defined but shall be such that,
when it is introduced into a suitable host, the host can
produce the genetic product through expression of the gene. As
examples, referred to are genes cloned from living
organisms, artificially synthesized genes, and even genes
cloned from living organisms and artificially mutated. As the
living organisms from which the genes are derived, prokaryotes
with definite enzyme producibility are employable. As examples
of such prokaryotes, mentioned are Bacillus bacteria. One
example of the genes derived from such bacteria is a protease
API21 gene derived from Bacillus NKS-21 (FERM BP-93-1)
(Japanese Patent Application Laid-open No. 5-91876, Sequence
Number 1).

DNA Pool 

The present invention also provides a DNA pool to be
obtained according to the above-mentioned shuffling method.
The "DNA pool" as referred to herein means a high-density
mixture of two or more DNAs. The DNA pool of the present
invention can contain a particular number or more, for example,
10 or more different DNA molecules having different structures.
It is desirable that, when the mixture (DNA pool) is directly
used in biochemical operation or reaction, it is in such a form
that all the plural nucleic acid components constituting it can
be reacted. However, the form of the mixture (DNA pool) is not
specifically defined, and the DNA pool may be either in
solution or in the form of a dry mixture.

To produce the DNA pool, for example, a plurality of
cohesive ends for each block are prepared in the
above-mentioned shuffling process. Referring to the embodiment
of Fig. 1, for example, when, for the cohesive end a1r of the
block a1, complementary strands to the other cohesive ends, aA
and a2f, are prepared in addition to the complementary strand to
a3f, then DNAs of A - a1 - a2 - B and A - a1 - a2 - a1 can be
obtained. If a complementary strand to the other cohesive end
a1f of a1 is added, it is also possible to produce other DNAs
comprising a series of the same blocks, such as A - a1 - a1 -
a1.

In the same manner, for the cohesive ends of a2 and a3, if
oligonucleotides that are complementary to the cohesive ends of
the other blocks or complementary to the other cohesive end of
themselves are added, other sequences comprising these can be
produced.

In general, a DNA is divided into blocks of a1, a2, a3,
..., an. Then, each block is processed to have a cohesive end
or cohesive ends according to the above-mentioned process. The
cohesive ends are designed to be oligonucleotides that are
complementary to the cohesive ends of the other blocks or are
complementary to the other cohesive end of themselves. All or
a part of the thus-obtained DNA blocks are mixed and ligated to
each other, thereby producing a nucleic acid pool containing
different nucleic acids composed of the blocks as differently
sequenced at random.
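The size of such a pool can be estimated with a short enumeration. The sketch below assumes, for illustration only, that the terminal blocks A and B stay fixed, that every intermediate cohesive end has been given complements to all the others, and that exactly three intermediate positions are filled; `enumerate_pool` is a hypothetical helper, and real ligation products may also differ in length.

```python
from itertools import product

middle_blocks = ["a1", "a2", "a3"]


def enumerate_pool(middle_blocks, positions, head="A", tail="B"):
    """List every arrangement with fixed termini and repeated blocks allowed."""
    return [(head, *combo, tail)
            for combo in product(middle_blocks, repeat=positions)]


pool = enumerate_pool(middle_blocks, positions=3)
print(len(pool))  # 27 arrangements, i.e. 3 ** 3
print(" - ".join(pool[0]))  # A - a1 - a1 - a1 - B
```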
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Expression of Genetic Information in Shuffled DNA or DNA Pool 

The thus-shuffled, single or mixed, double-stranded DNAs
are blunted. The blunting may be omitted if DNA blocks with
one cohesive end are positioned at the ends of the shuffled,
double-stranded DNA. For example, the 5'-terminal of the
sequence containing a DNA block with a predetermined promoter
sequence, as defined by the direction of the promoter, is not
made to have a cohesive end but is made to have a blunt end,
while the 3'-terminal of the sequence containing a DNA block
with a predetermined terminator sequence, as defined by the
direction of the terminator, is not made to have a cohesive end
but is also made to have a blunt end. In that manner, it is
possible to directly obtain a gene in which the blocks of the
intended gene have been shuffled between the promoter and the
terminator, without blunting it. After this, the thus-shuffled
DNA is inserted into a desired vector, preferably an expression
vector such as pKK223-3, using a DNA ligase. The promoter
sequence and the terminator sequence in the shuffled DNA
are not limited to only one each; a plurality of promoter
sequences and terminator sequences may be present therein.

If desired, the polynucleotide blocks positioned at
both ends of the shuffled DNA may be designed to have suitable
restriction enzyme recognition sites. In this case, the DNA
may be ligated to a suitable vector, using the defined
restriction enzymes.

Next, the vector library produced in the manner
mentioned above is introduced into a suitable host, in which
the genetic information is expressed. Thus, the intended
genetic product with favorable properties and also the gene
coding for it can be obtained. Any and every ordinary host can
be used herein. Preferred examples of the host include cells of
E. coli, Bacillus bacteria, yeasts, and lactic acid bacteria.

If desired, in-vitro transcription systems and translation 
systems are also employable herein. In those cases, the gene- 
tic information can be expressed even when the gene is not 
ligated to a vector. 

The "genetic information" as referred to herein indicates 
the information on a gene which is carried by a DNA and which 
is translated into a protein or is transcribed into RNA in a 
suitable living body by the DNA for itself or after having been 
ligated to any other DNA or RNA. 

The genetic information that is expected to be expressed
according to the method of the present invention is not
specifically defined, but includes, for example, that on
various genetic products, such as enzymes, antibodies, hormones,
receptor proteins and ribozymes, and that on various control
functions of, for example, operators, promoters and
attenuators.

Examples 

Now, the present invention is described in detail 
hereinunder with reference to the following examples, which, 
however, are not intended to restrict the scope of the present 
invention. 
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Examp l e i; — Product on of nwt 

A nucleic acid pool was produced in accordance with the
process mentioned below, based on the wild-type alkali protease
(Japanese Patent Application Laid-open No. 5-91876) as cloned
from a protease API21 (Bacillus NKS-21; FERM BP-93-1) having a
sequence of Sequence Number 1.

(1) Step a): Preparation of Oligonucleotide Blocks for Primer
(1-1) Synthesis of Oligonucleotide Blocks:

Using an automatic DNA synthesizer, Model 392
(manufactured by Perkin Elmer Co.), synthesized were 14
oligonucleotides: oligo FW (Sequence Number 2), oligo RV
(Sequence Number 3), oligo 1r (Sequence Number 4), oligo 1b
(Sequence Number 5), oligo 1a (Sequence Number 6), oligo 2r
(Sequence Number 7), oligo 2b (Sequence Number 8), oligo 2a
(Sequence Number 9), oligo 3r (Sequence Number 10), oligo 3b
(Sequence Number 11), oligo 3a (Sequence Number 12), oligo 4r
(Sequence Number 13), oligo 4b (Sequence Number 14), oligo 4a
(Sequence Number 15) and oligo A (Sequence Number 16). These
are parts of the base sequence of API21 (Japanese Patent
Application Laid-open No. 5-91876) (including complementary
strands), or oligonucleotides containing a part of the base
sequence. However, the sequence of oligo 4a follows the
glutamine of Sequence Number 1 and contains the
termination codon of the gene. These oligonucleotides were so
designed that they might be optimal when the oligo A was
overhung on the 3'-terminal of the amplified DNA in the
experiment to follow hereinunder, using a Taq polymerase.
These oligonucleotides were synthesized in a DM-trityl-on
condition (that is, while the 5'-hydroxyl group was protected
with a dimethoxytrityl group), and purified through an OPC
column. The reagents used herein were obtained from Perkin
Elmer Co.

(1-2) Addition of Ribonucleotide to Blocks:

Next, 500 pmols of oligo 1r, 1 nmol of ATP and 10 units of
terminal deoxynucleotidyl transferase were added to a standard
solution comprising:
50 mM Tris-HCl buffer (pH 8.0)
10 mM MgCl2
5 mM DTT (dithiothreitol)
25 % PEG 6000
1 mM HCC (hexaammine cobalt chloride)
10 µg/ml BSA (bovine serum albumin),
to thereby make 10 µl in total. The resulting solution was
left at 37 °C for 1 hour.

Oligo 2r, oligo 3r, oligo 4r, oligo 1b, oligo 2b, oligo 3b
and oligo 4b were processed in the same manner as above. The
polynucleotides thus formed are referred to as oligo 1r',
oligo 2r', oligo 3r', oligo 4r', oligo 1b', oligo 2b', oligo
3b' and oligo 4b'.
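For orientation, the amounts quoted for the ribonucleotide-addition reaction can be converted into concentrations. The sketch below assumes a total reaction volume of 10 µl; the helper name `micromolar` is illustrative. It relies only on the unit identity that 1 pmol per µl equals 1 µmol per litre.

```python
def micromolar(amount_pmol, volume_ul):
    """Convert pmol in a given volume (µl) to µM; 1 pmol/µl == 1 µM."""
    return amount_pmol / volume_ul


REACTION_VOLUME_UL = 10      # assumed total volume of the reaction

oligo_um = micromolar(500, REACTION_VOLUME_UL)    # 500 pmol of oligonucleotide
atp_um = micromolar(1000, REACTION_VOLUME_UL)     # 1 nmol = 1000 pmol of ATP

print(oligo_um, atp_um, atp_um / oligo_um)  # 50.0 100.0 2.0
```

Under this assumption the oligonucleotide is at 50 µM and ATP at 100 µM, a two-fold molar excess of ATP over oligonucleotide 3'-ends.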
(1-3) Phosphorylation:

500 pmols of oligo 1a, 1 nmol of ATP and 10 units of
polynucleotide kinase were dissolved in the standard solution
having the same composition as above to make 10 µl in total.
The resulting solution was left at 37 °C for 1 hour. Oligo 2a,
oligo 3a and oligo 4a were processed in the same manner as
above. These polynucleotides thus formed are referred to as
oligo 1a', oligo 2a', oligo 3a' and oligo 4a'.
(1-4) Ligation of Oligonucleotide Blocks:

500 pmols of oligo 1a', 100 pmols of oligo 1b', 100 pmols
of oligo 2b', 100 pmols of oligo 3b', 100 pmols of oligo 4b',
which had been obtained in the above, as well as 1 nmol of ATP
and 50 units of T4 RNA ligase were added to the same standard
solution as that mentioned above to make 10 µl in total, and
these were reacted at 25 °C for 4 hours.

The other combinations, oligo 2a' with oligo 1b', oligo
2b', oligo 3b' and oligo 4b'; oligo 3a' with oligo 1b', oligo
2b', oligo 3b' and oligo 4b'; and oligo 4a' with oligo 1b',
oligo 2b', oligo 3b' and oligo 4b', were also reacted in the
same manner as above. A mixture of the four polynucleotides
thus formed as a result of this reaction, oligo 1a' ligated to
oligo 1b', oligo 2b', oligo 3b' and oligo 4b', is referred to
as oligo 1M; a mixture of the four polynucleotides, oligo 2a'
ligated to oligo 1b', oligo 2b', oligo 3b' and oligo 4b', is
referred to as oligo 2M; a mixture of the four polynucleotides,
oligo 3a' ligated to oligo 1b', oligo 2b', oligo 3b' and oligo
4b', is referred to as oligo 3M; and a mixture of the four
polynucleotides, oligo 4a' ligated to oligo 1b', oligo 2b',
oligo 3b' and oligo 4b', is referred to as oligo 4M.
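The composition of the four primer mixtures can be written out mechanically. The following Python sketch only enumerates labels; the `b'+a'` string notation is an illustrative convention chosen here, with the ribonucleotide-carrying b' part written on the 5' side of the a' part as in step a) of the method.

```python
a_oligos = ["1a'", "2a'", "3a'", "4a'"]   # phosphorylated oligonucleotides
b_oligos = ["1b'", "2b'", "3b'", "4b'"]   # oligonucleotides carrying a 3' ribonucleotide

# Each mixture "oligo nM" holds one a' oligonucleotide joined to every b'.
mixtures = {
    f"oligo {a[0]}M": [f"{b}+{a}" for b in b_oligos]
    for a in a_oligos
}

for name, members in mixtures.items():
    print(name, members)

total_primers = sum(len(members) for members in mixtures.values())
print(total_primers)  # 16 chimeric primers in 4 mixtures
```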
(2) Steps b) to d): Formation of Gene Blocks

A template, plasmid pSDT812 (Japanese Patent Application
Laid-open No. 1-141596), which had been prepared by inserting,
into the ClaI cleaving site of pHSG396, the gene of the
wild-type alkali protease as cloned from Bacillus NKS-21, was
subjected to PCR with primers, oligo 1M and oligo 2r'. The
gene fragment amplified through this reaction was treated
with a ribonuclease, and then heated at 80 °C for 5 minutes,
whereby the polynucleotide(s) positioned on the 5'-terminal side
of the ribonucleotide existing in both strands or in one strand
was/were removed. As a result, a gene block with cohesive
end(s) was prepared. This gene block is referred to as
block 1M.

The other four combinations, oligo 2M and oligo 3r', oligo
3M and oligo 4r', oligo 4M and oligo RV, and oligo FW and oligo
1r', were processed in the same manner as above. The blocks
thus prepared are referred to as block 2M, block 3M, block B,
and block F, respectively.
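The effect of steps b) to d) on one end of a block can be pictured with a toy string model in Python. This is only a sketch under simplifying assumptions: the sequences are invented, the last base of the 5' tail stands in for the ribonucleotide, the polymerase is assumed to copy the whole primer strand, and the function names are hypothetical. Once the tail of the primer strand is cut at the ribonucleotide and released by heating, the part of the complementary strand that was paired with it is left single-stranded as a 3' overhang.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def cohesive_block(tail, body):
    """Model one block end after ribonuclease treatment and heating.

    tail: 5' part of the chimeric primer; its last base represents the
          ribonucleotide (assumption of this sketch).
    body: remainder of the primer strand (annealing part plus extension).
    """
    top_full = tail + body                   # strand grown from the primer
    bottom = reverse_complement(top_full)    # strand copied from it
    top_after = body                         # tail cut off and released
    overhang = bottom[len(body):]            # unpaired 3' end of bottom
    return top_after, bottom, overhang


tail, body = "CAGTTA", "ATGGCCGAA"           # invented sequences
top_after, bottom, overhang = cohesive_block(tail, body)
print(top_after, bottom, overhang)
```

In this model the overhang always equals the reverse complement of the tail, so choosing the tail sequence is what determines which other block the end can anneal to.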



Example 2: Shuffling

Block 1M, block 2M, block 3M, block B and block F in
equal amounts were blended and ligated together, using Pfu DNA
ligase.

After the ligation, the reaction mixture was subjected to
agarose gel electrophoresis, through which was collected the
DNA fragment of about 1.5 kbp.

Example 3: Identification of Nucleic Acid Pool

The thus-collected DNA of about 1.5 kbp was digested with
restriction enzymes, EcoRI and BamHI, then mixed with a
plasmid, pHY300PLK (manufactured by Yakult Honsha Co.), which
had been digested with restriction enzymes, EcoRI and BamHI, and
processed with an alkali phosphatase, and thus ligated
together, using a ligation kit (manufactured by Takara Shuzo
Co.). Using the resulting DNA, cells of E. coli JM105 were
transformed, from which were selected tetracycline-resistant
transformants. From these transformants, plasmid DNAs were
extracted, purified and analyzed according to ordinary methods.
Thus were obtained 97 clones with a DNA of 1.5 kbp inserted
between the EcoRI and BamHI recognition sites of pHY300PLK.

The base sequences of the DNAs thus obtained in the
manner mentioned above were determined to analyze in what
order block 1M, block 2M, block 3M, block F and block B were
ligated, that is, how these were shuffled. As in the
principle, block F was positioned at the first site while block
B was at the fifth site, and block 1M, block 2M and block 3M
were shuffled between the two. Table 1 shows the different
types of shuffling, and the number of clones with each type.

Table 1

Type of Shuffling   Number of Clones   Type of Shuffling   Number of Clones
111                 2                  223                 2
112                 5                  231                 5
113                 2                  232                 2
121                 3                  233                 3
122                 4                  311                 2
123                 7                  312                 6
131                 4                  313                 5
132                 5                  321                 7
133                 3                  322                 2
211                 1                  323                 5
212                 5                  331                 2
213                 4                  332                 5
221                 1                  333                 2
222                 3

As in the above, it has been confirmed that, if three 
blocks of one gene are shuffled according to the method of the 
present invention, a nucleic acid pool is obtained that covers 
all combinations of clones each containing the same or 
different three of these blocks. 
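The coverage stated above can be checked directly against the figures of Table 1. The Python sketch below transcribes the table into a dictionary (the variable names are illustrative) and confirms that all 27 possible orderings of three blocks occur at least once and that the counts add up to the 97 clones analyzed.

```python
from itertools import product

# Table 1: type of shuffling -> number of clones.
clone_counts = {
    "111": 2, "112": 5, "113": 2, "121": 3, "122": 4, "123": 7,
    "131": 4, "132": 5, "133": 3, "211": 1, "212": 5, "213": 4,
    "221": 1, "222": 3, "223": 2, "231": 5, "232": 2, "233": 3,
    "311": 2, "312": 6, "313": 5, "321": 7, "322": 2, "323": 5,
    "331": 2, "332": 5, "333": 2,
}

all_types = {"".join(p) for p in product("123", repeat=3)}
missing = all_types - set(clone_counts)
total_clones = sum(clone_counts.values())

print(len(clone_counts), sorted(missing), total_clones)  # 27 [] 97
```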

Example 4: Screening of Genetic Products Obtained from Nucleic
Acid Pool

The DNAs as produced in Example 3 were mixed. Using the
resulting DNA mixture, cells of Bacillus subtilis UOT0999 were
transformed, and tetracycline-resistant transformants were
selected. 300 transformants were replicated on a skim
milk-containing medium plate, on which were found clear zones
around the colonies of 12 transformants. Accordingly, it is
understood that the enzyme which the shuffled gene codes for can
be selected depending on its activity. The base sequences of
these 12 clones that formed the clear zones were analyzed, from
which it was found that these were sequenced in the same order
of blocks as in the wild-type enzyme.

Example 5: Detection of [illegible]

From 10 clones (one clone forms a halo, while 9 clones do
not) selected from the transformants that had been obtained in
Example 3, and also from the host, Bacillus subtilis UOT0999,
full-length RNAs were prepared. These were processed with a
ribonuclease-free deoxyribonuclease, in order to remove the
influence of the plasmids on the hybridization to be effected
later on. Next, using oligo 1r as the probe, these were
subjected to Northern hybridization. As a result, all lanes
corresponding to the RNA of the transformants gave detectable
bands, but no band was detected on the lanes corresponding to
the RNA of the host.

Advantages of the Invention

According to the present invention, provided is a
double-stranded DNA molecule with any desired cohesive end or
ends. Using this, it is possible to obtain various DNAs with
various base sequences which are substantially apart from the
naturally-existing base sequence spaces, and also a DNA pool of
a mixture of such DNAs, through simple processes. Therefore,
it is possible to obtain excellent genetic products, such as
proteins and enzymes, which could not be obtained in
conventional methods and which were not examined by organisms
in the past. In addition, according to the method of the
present invention for producing a nucleic acid pool, it is
possible to obtain a mixture of nucleic acids while optionally
shuffling the constitutive blocks at random in the intermediate
parts but fixing the terminal sequences to be predetermined,
desired ones, and it is also possible to shuffle the
constitutive blocks without changing the amino acid sequence
which each block codes for. Therefore, as compared with a
method of producing a completely-randomized nucleic acid pool,
there is a high possibility that useful genetic products can be
produced according to the method of the present invention.
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Sequence Listing 

Sequence Number: 1
Length of Sequence: 1122
Type of Sequence: Nucleic Acid
Number of Strands: Double-stranded
Topology: Linear
Kind of Sequence: Genomic DNA
Source: Bacillus NKS-21 (FERM BP-93-1)
Characteristics of Sequence:
Code Indicating Characteristics: Sig Peptide
Existing Site: 1 ... 93
Method of Determining Characteristics: S
Code Indicating Characteristics: Mat Peptide
Existing Site: 104 ... 1112
Method of Determining Characteristics: S

Sequence:
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TAC TCC GAT GTT GTT CAT CGT CAA GGT TAC TTT GGG AAC GGA GTA AAA 384 
Tyr Ser Asp Val Val His Arg Gln Gly Tyr Phe Gly Asn Gly Val Lys
15 20 25

GTA GCA GTA CTT GAT ACA GGA GTG GCT CCT CAT CCT GAT TTA CAT ATT 432
Val Ala Val Leu Asp Thr Gly Val Ala Pro His Pro Asp Leu His Ile
30 35 40

AGA GGA GGA GTA AGC TTT ATC TCT ACA GAA AAC ACT TAT GTG GAT TAT 480
Arg Gly Gly Val Ser Phe Ile Ser Thr Glu Asn Thr Tyr Val Asp Tyr
45 50 55

AAT GGT CAC CGT ACT CAC GTA CCT GGT ACT GTA CCT GCC CTA AAC AAT 528
Asn Gly His Gly Thr His Val Ala Gly Thr Val Ala Ala Leu Asn Asn
60 65 70

TCA TAT GCC CTA TTG GGA GTG GCT CCT GGA GCT GAA CTA TAT GCT GTT 576
Ser Tyr Gly Val Leu Gly Val Ala Pro Gly Ala Glu Leu Tyr Ala Val
75 80 85 90

AAA GTT CTT GAT CGT AAC GGA AGC GGT TCG CAT CCA TCC ATT GCT CAA 624
Lys Val Leu Asp Arg Asn Gly Ser Gly Ser His Ala Ser Ile Ala Gln
95 100 105

GGA ATT GAA TGC GCC ATC AAT AAT GGG ATG GAT ATT CCC AAC ATC AGT 672
Gly Ile Glu Trp Ala Met Asn Asn Gly Met Asp Ile Ala Asn Met Ser
110 115 120

TTA GGA ACT CCT TCT GGG TCT ACA ACC CTG CAA TTA GCA CCA CAC CCC 720
Leu Gly Ser Pro Ser Gly Ser Thr Thr Leu Gln Leu Ala Ala Asp Arg
125 130 135

GCT ACG AAT CCA GGT GTC TTA TTA ATT GGG GCG CCT GCA AAC TCA GCA 768
Ala Arg Asn Ala Gly Val Leu Leu Ile Gly Ala Ala Gly Asn Ser Gly
140 145 150

CAA CAA GGC GCC TCG AAT AAC ATG GGC TAC CCA GCG CCC TAT GCA TCT 816
Gln Gln Gly Gly Ser Asn Asn Met Gly Tyr Pro Ala Arg Tyr Ala Ser
155 160 165 170

GTC ATG CCT GTT GGA GCG GTG GAC CAA AAT GGA AAT AGA GCG AAC TTT 864
Val Met Ala Val Gly Ala Val Asp Gln Asn Gly Asn Arg Ala Asn Phe
175 180 185

TCA AGC TAT GGA TCA GAA CTT GAG ATT ATG GCG CCT GGT GTC AAT ATT 912
Ser Ser Tyr Gly Ser Glu Leu Glu Ile Met Ala Pro Gly Val Asn Ile
190 195 200

AAC AGT ACG TAT TTA AAT AAC GGA TAT CCC AGT TTA AAT CCT ACG TCA 960
Asn Ser Thr Tyr Leu Asn Asn Gly Tyr Arg Ser Leu Asn Gly Thr Ser
205 210 215

ATC GCA TCT CCA CAT CTT CCT GGG CTA GCT CCA TTA CTT AAA CAA AAA 1008
Met Ala Ser Pro His Val Ala Gly Val Ala Ala Leu Val Lys Gln Lys
220 225 230

CAC CCT CAC TTA ACG GCG GCA CAA ATT CGT AAT CGT ATG AAT CAA ACA 1056
His Pro His Leu Thr Ala Ala Gln Ile Arg Asn Arg Met Asn Gln Thr
235 240 245 250

CCA ATT CCC CTT GGT AAC ACC ACG TAT TAT GCA AAT CGC TTA CTG GAT 1104
Ala Ile Pro Leu Gly Asn Ser Thr Tyr Tyr Gly Asn Gly Leu Val Asp
255 260 265

GCT GAG TAT CCC CCT CAA 1122
Ala Glu Tyr Ala Ala Gln
270 272
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Sequence Number: 2 

Length of Sequence: 20
Type of Sequence: Nucleic Acid
Number of Strands: Single-stranded
Topology: Linear
Kind of Sequence: Other Nucleic Acid, Synthetic DNA
Sequence:
CATTTTAGAA TTCGCAGCGC

Sequence Number: 3
Length of Sequence: 25
Type of Sequence: Nucleic Acid
Number of Strands: Single-stranded
Topology: Linear
Kind of Sequence: Other Nucleic Acid, Synthetic DNA
Sequence:
CCCGATTCCT TAAAGCCCTC AATAA

Sequence Number: 4
Length of Sequence: 17
Type of Sequence: Nucleic Acid
Number of Strands: Single-stranded
Topology: Linear
Kind of Sequence: Other Nucleic Acid, Synthetic DNA
Sequence:
ACAGTCTGTG CAATTTC

Sequence Number: 5
Length of Sequence: 17
Type of Sequence: Nucleic Acid
Number of Strands: Single-stranded
Topology: Linear
Kind of Sequence: Other Nucleic Acid, Synthetic DNA
Sequence:
CAAATTCCAC ACACTCT

Sequence Number: 6
Length of Sequence: 20
Type of Sequence: Nucleic Acid
Number of Strands: Single-stranded
Topology: Linear
Kind of Sequence: Other Nucleic Acid, Synthetic DNA
Sequence:
CCTTGCGCAA TCCCTTATAT

Sequence Number: 7
Length of Sequence: 17
Type of Sequence: Nucleic Acid
Number of Strands: Single-stranded
Topology: Linear
Kind of Sequence: Other Nucleic Acid, Synthetic DNA
Sequence:
CCCAATACCC CATATGA

Sequence Number: 8
Length of Sequence: 17
Type of Sequence: Nucleic Acid
Number of Strands: Single-stranded
Topology: Linear
Kind of Sequence: Other Nucleic Acid, Synthetic DNA
Sequence:
TCATATGGCC TATTCGG

Sequence Number: 9
Length of Sequence: 20
Type of Sequence: Nucleic Acid
Number of Strands: Single-stranded
Topology: Linear
Kind of Sequence: Other Nucleic Acid, Synthetic DNA
Sequence:
GTGGCTCCTG GACCTGAACT

Sequence Number: 10
Length of Sequence: 16
Type of Sequence: Nucleic Acid
Number of Strands: Single-stranded
Topology: Linear
Kind of Sequence: Other Nucleic Acid, Synthetic DNA
Sequence:
TCTGATCCAT AGCTTG

Sequence Number: 11
Length of Sequence: 16
Type of Sequence: Nucleic Acid
Number of Strands: Single-stranded
Topology: Linear
Kind of Sequence: Other Nucleic Acid, Synthetic DNA
Sequence:
CAACCTATCG ATCAGA

Sequence Number: 12
Length of Sequence: 20
Type of Sequence: Nucleic Acid
Number of Strands: Single-stranded
Topology: Linear
Kind of Sequence: Other Nucleic Acid, Synthetic DNA
Sequence:
CTTGAGATTA TGGCCCCTGG

Sequence Number: 13
Length of Sequence: 17
Type of Sequence: Nucleic Acid
Number of Strands: Single-stranded
Topology: Linear
Kind of Sequence: Other Nucleic Acid, Synthetic DNA
Sequence:
TGAGCCGCAT ACTCAGC

Sequence Number: 14
Length of Sequence: 17
Type of Sequence: Nucleic Acid
Number of Strands: Single-stranded
Topology: Linear
Kind of Sequence: Other Nucleic Acid, Synthetic DNA
Sequence:
GCTGAGTATG CGGCTCA

Sequence Number: 15
Length of Sequence: 20
Type of Sequence: Nucleic Acid
Number of Strands: Single-stranded
Topology: Linear
Kind of Sequence: Other Nucleic Acid, Synthetic DNA
Sequence:
TAATCCCTAA GGATGTACTG
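
As printed in this listing, Sequence Number 13 (oligo 4r) and Sequence Number 14 (oligo 4b) are exact reverse complements of each other, which is what one would expect if both oligonucleotides lie at the same block boundary on opposite strands. A minimal Python check of that relation follows; the helper name is illustrative, and no claim is made here about the other r/b pairs.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences copied from the listing, with the spacing removed.
oligo_4r = "TGAGCCGCATACTCAGC"   # Sequence Number 13
oligo_4b = "GCTGAGTATGCGGCTCA"   # Sequence Number 14

print(reverse_complement(oligo_4b) == oligo_4r)  # True
```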



Brief Description of the Drawing 

Fig. 1 is a graphical view showing one embodiment of the method of
the present invention for shuffling a DNA.
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INDICATIONS RELATING TO A DEPOSITED MICROORGANISM 
(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page 24, lines 12-15.

B. IDENTIFICATION OF DEPOSIT

Name of depositary institution:
National Institute of Bioscience and Human-Technology, Agency of Industrial Science and
Technology, Ministry of International Trade and Industry

Address of depositary institution (including postal code and country):
1-3 Higashi 1-chome, Tsukuba-shi, Ibaraki-ken, Japan

Date of deposit: 7 May 1985

Accession Number: FERM BP-93-1

C. ADDITIONAL INDICATIONS (leave blank if not applicable)

Until the publication of the mention of grant of a European patent or, where applicable, for twenty years from the
date of filing if the application has been refused, withdrawn or deemed withdrawn, a sample of the deposited
microorganism is only to be provided to an independent expert nominated by the person requesting the sample (cf. Rule ...).
As far as Australia is concerned, the expert option is likewise requested, reference being had to
Regulation 3.25 of Australia Statutory Rules 1991 No 71. Also, for Canada we request that only an independent
expert nominated by the Commissioner is authorized to have access to a sample of the microorganism deposited.

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

Form PCT/RO/134 (July 1992)
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CLAIMS 

1. A DNA with a cohesive end comprising (a) a double- 
stranded DNA having the sane sequence as that of a part of a 
gene, and (b) a single-stranded DNA having a base sequence that 
s exists on said gene at the site not adjoining the part 
corresponding to said double-stranded DNA or a base sequence 
which said gene does not have, wherein the single-stranded DNA 
is linked to either one end of the double-stranded DNA to form 
a cohesive end. 

10 ?• A DNA with cohesive ends comprising (a) a double- 

stranded DNA having the same sequence as that of a part of a 
gene, (b) a first, single-stranded DNA having a base sequence 
that exists on said gene at the site not adjoining the part 
corresponding to said double-stranded DNA or a base sequence 

is which said gene does not have, and (c) a second, single- 
stranded DNA having a base sequence that exists on said gene at 
the site adjoining the part corresponding to said double- 
stranded DNA, wherein the second, single-stranded DNA is linked 
to said double-stranded DNA at one end corresponding to said 
20 adjoining site, while the first, single-stranded DNA is linked 
thereto at the other end of the complementary strand opposite 
- to said end, thereby forming cohesive ends. 

3. The DNA with a cohesive end or cohesive ends as 
claimed in claim 1 or 2, wherein the single-stranded DNA has a 

25 length of 2 bases or more. 

4. The DNA with a cohesive end or cohesive ends as
claimed in any one of claims 1 to 3, wherein the cohesive
end/ends is/are positioned at the 3'-terminal/terminals.

5. A method for producing a DNA with a cohesive end or
cohesive ends, wherein a part of a DNA, as a template, and an
oligonucleotide containing at least one ribonucleotide, as a
primer, are subjected to DNA polymerase reaction to prepare a
double-stranded DNA, then the ribonucleotide(s) is/are removed
through enzymatic reaction or chemical reaction, and the
nucleotide(s) remaining at the 5'-terminal(s) of the site(s) at
which said ribonucleotide(s) existed are removed.

6. A method for producing the DNA with a cohesive end as
set forth in claim 1, comprising the following steps a) to d):

a) a step of linking (i) an oligonucleotide having the
same base sequence as that of a part of a gene DNA to (ii) an
oligonucleotide having a base sequence that exists on the gene
at the site not adjoining the base sequence of (i) or a base
sequence which the gene does not have, and containing at least
one ribonucleotide, in such a manner that the oligonucleotide
(ii) is positioned at the 5'-terminal of the oligonucleotide
(i);

b) a step of preparing a double-stranded DNA through DNA
polymerase reaction between a DNA containing the part
corresponding to the oligonucleotide (i) in said a), as a
template, and the linked oligonucleotide as obtained in the
previous step a), as a primer;

c) a step of removing the ribonucleotide from said
double-stranded DNA through enzymatic reaction or chemical
reaction; and

d) a step of removing the nucleotide remaining at the
5'-terminal of the site at which said ribonucleotide existed.

7. A method for producing the DNA with cohesive ends as
set forth in claim 2, comprising the following steps a) to d):

a) a step of linking (i) an oligonucleotide having the
same base sequence as that of a part of a gene DNA to (ii) an
oligonucleotide having a base sequence that exists on the gene
at the site not adjoining the base sequence of (i) or a base
sequence which the gene does not have, and containing at least
one ribonucleotide, in such a manner that the oligonucleotide
(ii) is positioned at the 5'-terminal of the oligonucleotide
(i);

b) a step of preparing a double-stranded DNA through DNA
polymerase reaction between a DNA containing the part
corresponding to the oligonucleotide (i) in said a), as a
template, and (i) the linked oligonucleotide as obtained in the
previous step a) and (ii) an oligonucleotide which is a
complementary strand of an oligonucleotide existing on the gene
at the site separated from said oligonucleotide-corresponding
part by at least 3 bases toward the 3'-terminal and which
contains at least one ribonucleotide, as primers;

c) a step of removing the ribonucleotides from said
double-stranded DNA through enzymatic reaction or chemical
reaction; and

d) a step of removing the nucleotides remaining at the
5'-terminals of the sites at which said ribonucleotides existed.

8. A method for shuffling a DNA, comprising dividing a
DNA into a plurality of DNA blocks each having a cohesive end
or cohesive ends, followed by ligating them together into a
sequence that is different from the sequence of the original,
non-divided DNA.

9. A method for shuffling a DNA, comprising applying the
method as set forth in any one of claims 5 to 7 to various
sites of a DNA, thereby dividing the DNA into a plurality of
DNA blocks each having a cohesive end or cohesive ends, at
least one block of which shall have a cohesive end that is
complementary to the cohesive end of another block not having
been directly adjacent to said one block on the original DNA,
followed by ligating them together into a sequence that is
different from the sequence of the original, non-divided DNA.

10. The shuffling method as claimed in claim 8 or 9, 
wherein the DNA is divided into 3 or more blocks. 

11. The shuffling method as claimed in any one of claims
8 to 10, wherein the blocks are ligated together using a DNA
ligase.

12. A DNA as shuffled according to the method as set 
forth in any one of claims 8 to 11. 

13. The DNA as claimed in claim 12, wherein a gene coding
for an enzymatic function or a control gene for the gene is
shuffled.

14. The DNA as claimed in claim 13, wherein the gene is a
gene that codes for any one of proteases, lipases, cellulases,
amylases, catalases, xylanases, oxidases, dehydrogenases,
oxygenases and reductases.

15. The DNA as claimed in claim 13 or 14, wherein the 
gene is one derived from prokaryotes.

16. The DNA as claimed in claim 15, wherein the gene is
one derived from bacillus bacteria.

17. The DNA as claimed in claim 16, wherein the gene is a
protease API21 gene.

18. A DNA pool containing plural kinds of DNAs having
different structures that are obtained according to the
shuffling method as set forth in any one of claims 8 to 11.

19. The DNA pool as claimed in claim 18, which contains
10 or more kinds of DNAs.

20. A method for producing a DNA pool, comprising
applying the method as set forth in any one of claims 5 to 7 to
various sites of a template DNA to thereby prepare a mixture of
DNA blocks each having a cohesive end or cohesive ends that
satisfies the following conditions, followed by ligating these
into any desired sequences:

Condition 1: Each block has a double-stranded site having
the same sequence as that of a part of the template DNA.

Condition 2: At least two of the blocks that constitute
the block mixture further have, in addition to said
double-stranded site, a single-stranded site (cohesive end) that
is complementary to the cohesive end of blocks that are not
directly adjacent to said blocks on the template DNA.

Condition 3: The block mixture contains at least two
different blocks which are the same in the double-stranded site
but are different only in the single-stranded site and which
satisfy the condition 2.

21. The method for producing a DNA pool as claimed in
claim 20, wherein the template DNA is a gene that codes for an
enzymatic function or a control gene DNA for the gene.

22. The method for producing a DNA pool as claimed in
claim 21, wherein the template DNA is a gene DNA that codes for
any one of proteases, lipases, cellulases, amylases, catalases,
xylanases, oxidases, dehydrogenases, oxygenases and reductases.

23. The method for producing a DNA pool as claimed in
claim 22, wherein the template DNA is one derived from
prokaryotes.

24. The method for producing a DNA pool as claimed in 
claim 23, wherein the template DNA is one derived from bacillus 
bacteria. 

25. The method for producing a DNA pool as claimed in
claim 24, wherein the template DNA is a protease API21 gene.

26. The method for producing a DNA pool as claimed in any 
one of claims 20 to 25, wherein the DNA blocks are ligated 
together using a DNA ligase. 

27. A genetic product to be obtained by expressing the
genetic information on DNA molecules that exist in the DNA pool
as set forth in any one of claims 18 to 26.
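
The ligation step recited in claims 8, 9 and 20 can be pictured
with a minimal Python sketch. It is an illustration under stated
assumptions, not the procedure of the specification: the block
names, the sequences, and the helpers Block, can_ligate, assemble
and ligation_orders are all hypothetical, each cohesive end is
modelled as a plain 3'-overhang string (compare claim 4), and
ligation is reduced to exact complementarity between overhangs.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import List, Optional, Sequence, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class Block:
    """Hypothetical model of a DNA block with cohesive end(s)."""
    name: str
    core: str        # double-stranded part, top strand written 5'->3'
    left: str = ""   # 3' overhang of the bottom strand, written 5'->3'
    right: str = ""  # 3' overhang of the top strand, written 5'->3'


def can_ligate(a: Block, b: Block) -> bool:
    """True if a's right overhang can anneal to b's left overhang."""
    return bool(a.right) and b.left == reverse_complement(a.right)


def assemble(order: Sequence[Block]) -> Optional[str]:
    """Top strand of the ligated product, or None if any junction fails."""
    for a, b in zip(order, order[1:]):
        if not can_ligate(a, b):
            return None
    return "".join(block.core + block.right for block in order)


def ligation_orders(blocks: Sequence[Block]) -> List[Tuple[str, ...]]:
    """All orders of the full block set in which every junction anneals."""
    return [
        tuple(block.name for block in order)
        for order in permutations(blocks)
        if assemble(order) is not None
    ]


def example_blocks() -> List[Block]:
    """Four made-up blocks listed in their original order A, B, C, D.

    The overhangs are chosen (an assumption of this sketch) so that
    A joins C, C joins B, and B joins D.
    """
    return [
        Block("A", "ATGGCA", right="ACCT"),
        Block("B", "CCGTAA", left=reverse_complement("GGAT"), right="CTTG"),
        Block("C", "TTGACC", left=reverse_complement("ACCT"), right="GGAT"),
        Block("D", "GGCTGA", left=reverse_complement("CTTG")),
    ]
```

Calling ligation_orders(example_blocks()) lists the block orders
that the chosen overhangs permit, while assemble(example_blocks())
returns None because the original order A, B, C, D has no
complementary junctions in this toy set. Non-palindromic overhangs
are used so that no block can pair with a copy of itself.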

Fig. 1